pacman::p_load(sf, readr, dplyr, ggplot2, knitr, mapview, spdep, tmap, tidyverse)Dynamic Patterns of Public Transport Usage
A Geospatial Analysis of Bus Stop Passenger Volume in Urban Environments
1.1 Overview
In the intricate mosaic of urban transportation, bus stops serve as pivotal nodes that capture the pulse of city life through the ebb and flow of passenger trips. The study of passenger trip traffic, particularly during peak hours, becomes essential for enhancing service efficiency, planning urban infrastructure, and improving the overall commuter experience. This geospatial analysis is anchored in the bustling landscape of a metropolitan area, where data on bus stops, resident population distribution, urban development plans (master plan sub-zones), and passenger volume intertwine to paint a comprehensive picture of transit dynamics.
This analysis aims to dissect the rhythms of urban mobility with the following objectives:
Geovisualization and Analysis
- Compute the passenger trips generated by origin at the hexagon level
- Visualize the geographical distribution of the passenger trips
Local Indicators of Spatial Association (LISA) Analysis
- Compute LISA of the passengers trips generate by origin at hexagon level
- Visualize the LISA maps of the passengers trips generate by origin at hexagon level
Emerging Hot Spot Analysis (EHSA)
- Perform Mann-Kendall Test by using the spatio-temporal local Gi* values
- Visualize EHSA maps of the Gi* values of the passenger trips by origin at the hexagon level
1.2 Load packages
sf imports geospatial data and readr imports csv file.
tidyverse performs data import, wrangling and visualization. dplyr perform relational join.
spdep compute spatial weights and calculate spatially lagged variables.
1.3 Import data
We will be using the following geospatial (busstops, mpsz) and aspatial (odbus, pop) datasets:
busstops contains the detailed information for all bus stops currently serviced by buses, including bus stop code, road name, description, location coordinates.
Source: LTA DataMall (Postman URL)
busstops <- st_read(dsn = "data/BusStopLocation_Jul2023", layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Take_home_Ex/Take_home_Ex01/data/BusStopLocation_Jul2023'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
The output indicates that the geospatial objects are point features. There are 5161 features and 3 fields. It is in SVY21 projected coordinates system with XY dimension.
odbus contains the number of trips by weekdays and weekends from origin to destination bus stops. It reflects the passenger trip traffic for October 2023.
Source: LTA DataMall (Postman URL)
odbus = read_csv("data/origin_destination_bus_202310.csv")The output does not indicate any geospatial objects. There are 5,694,297 records and 7 fields.
pop contains Singapore residents grouped by planning Area or subzone, age group, sex and floor area of residence. The data period is from June 2011 onwards. From June 2011 to 2019, the planning areas refer to areas demarcated in the Master Plan 2014, and from June 2020 onwards will be Master Plan 2019.
| x | Description |
|---|---|
| PA | Planning area |
| SZ | Subzone |
| AG | Age group |
| Sex | Sex |
| FA | Residence floor area |
| Pop | Resident count |
| Time | Time/period |
Source: Department of Statistics (Link)
pop <- read_csv("data/respopagesexfa2011to2020/respopagesexfa2011to2020.csv")mpsz is the Master Plan 2019, a forward looking guiding plan for Singapore’s development in the medium term over the next 10 to 15 years published in 2019. Note this mpsz differs from that in previous chapter, Data Wrangling.
Source: URA (Download here)
mpsz = st_read(dsn="data/MPSZ-2019", layer="MPSZ-2019") %>%
st_transform(crs=3414)Reading layer `MPSZ-2019' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Take_home_Ex/Take_home_Ex01/data/MPSZ-2019'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
The output indicates that the geospatial objects are multipolygon features. There are 332 features and 6 fields. It is in WGS84 projected coordinates system with XY dimension.
# Assuming mpsz is your sf object
#mpsz <- st_read(dsn="data/MP19_SUBZONE", layer="MP19_SUBZONE_NO_SEA")
# Ensure that 'geometry' is the active geometry column
#mpsz <- st_set_geometry(mpsz, "geometry")
# Check for invalid geometries and apply st_make_valid if needed
#mpsz$geometry_is_valid <- st_is_valid(mpsz$geometry)
#mpsz <- mpsz %>%
# mutate(geometry = ifelse(!geometry_is_valid, st_make_valid(geometry), geometry))
# Now, remove the Z dimension explicitly if it exists
#mpsz <- st_zm(mpsz, drop = TRUE, what = "Z")
# View the data structure to confirm changes
#print(mpsz1)Warning: GDAL Message 1: Sub-geometry 0 has coordinate dimension 2, but container has 3
The warning message indicate that a discrepancy in the dimensionality of the geometries in mpsz Shapefile. Some of the sub-geometries have 2D coordinates (X and Y), while the overall container expects 3D coordinates (X, Y, and Z). The st_zm(what = "Z") function drops the Z dimension from the geometries, which eliminate the warnings about coordinate dimensions. We are not performing 3D analysis, this approach should work fine for most applications.
1.4 Create Spatial Grids
Spatial grids are commonly used in spatial analysis to divide the study area into equal size, regular polygons that tessellate the area of interest. On the selection of grid, regular geographic units, such as square grid or fishnets, rarely reflect real world situations. Hexagons are compact shape and can overcome oddly-shaped geographical units.
Step 1: Create a hexgonal grid
hexagon is a hexagon layer of 250m where the distance is the perpendicular distance between the centre of the hexagon and its edges. It is a substitute for ‘mpsz’ which is relatively coarse and irregular.
Step 1
# Create hexagonal grid (250m from center to edges) and add grid ID.
area_hexagon_grid = st_make_grid(busstops, cellsize = 500, what = "polygons", square = FALSE)
hexagon_grid_sf = st_sf(area_hexagon_grid) %>%
mutate(grid_id = 1:length(lengths(area_hexagon_grid)))
hexagon_grid_sfSimple feature collection with 5580 features and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3470.122 ymin: 26193.43 xmax: 48720.12 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
area_hexagon_grid grid_id
1 POLYGON ((3720.122 26626.44... 1
2 POLYGON ((3720.122 27492.46... 2
3 POLYGON ((3720.122 28358.49... 3
4 POLYGON ((3720.122 29224.51... 4
5 POLYGON ((3720.122 30090.54... 5
6 POLYGON ((3720.122 30956.57... 6
7 POLYGON ((3720.122 31822.59... 7
8 POLYGON ((3720.122 32688.62... 8
9 POLYGON ((3720.122 33554.64... 9
10 POLYGON ((3720.122 34420.67... 10
The output indicates that the geospatial objects are polygon features. There are 5580 features and 1 fields. It is in SVY21 projected coordinates system with XY dimension.
Step 2
# Count the number of bus stops in each grid and remove grids without any value.
hexagon_grid_sf$grid_id = lengths(st_intersects(hexagon_grid_sf, busstops))
hexagon_count = filter(hexagon_grid_sf, grid_id > 0)
hexagon_countSimple feature collection with 1524 features and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3720.122 ymin: 26193.43 xmax: 48720.12 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
area_hexagon_grid grid_id
1 POLYGON ((3970.122 27925.48... 1
2 POLYGON ((4220.122 28358.49... 1
3 POLYGON ((4470.122 30523.55... 1
4 POLYGON ((4720.122 28358.49... 1
5 POLYGON ((4720.122 30090.54... 2
6 POLYGON ((4720.122 30956.57... 1
7 POLYGON ((4720.122 31822.59... 1
8 POLYGON ((4970.122 28791.5,... 1
9 POLYGON ((4970.122 29657.53... 1
10 POLYGON ((4970.122 30523.55... 2
The output indicates that the geospatial objects retained polygon features and there are more than one polygon feature in a grid_id. The intersection of busstops and hexagon_grid_sf is 1524 features and 1 fields, indicating only 1524 out of 5580 features contains bus stops. It is in SVY21 projected coordinates system with XY dimension.
Step 3
# Set tmap to view mode for interactive plotting.
tmap_mode("view")
hexagon = tm_shape(hexagon_count)+
tm_fill(
col = "grid_id",
palette = "Reds",
style = "cont",
title = "Number of Bus Stops in Singapore",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.format = list(
grid_id = list(format = "f", digits = 0)
)
)+
tm_borders(col = "grey40", lwd = 0.7)
hexagonThe geospatial distribution of bus stops in Singapore, as visualized in the above map, is extensive and dispersed throughout the central region, with a notable concentration of stops with darker shades of red signifying higher concentrations. Outlying regions, along the coastal areas, show fewer bus stops as indicated by the presence of lighter-colored hexagons or even white spaces where no stops are present. This distribution pattern suggests a robust public transportation infrastructure in urban and densely populated areas, tapering off in less populated or industrial regions.
1.5 Explore data
st_geometry() displays basic information of the feature class, such as type of geometry, the geographic extent of the features and the coordinate system of the data. glimpse() transposes the columns in a dataset and makes it possible to see the column name, data type and values in every column in a data frame.
st_geometry(busstops)Geometry set for 5161 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21 / Singapore TM
First 5 geometries:
glimpse(busstops)Rows: 5,161
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
odbus$ORIGIN_PT_CODE <- as.factor(odbus$ORIGIN_PT_CODE)
odbus$DESTINATION_PT_CODE <- as.factor(odbus$DESTINATION_PT_CODE)
glimpse(odbus)Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <fct> 04168, 04168, 80119, 80119, 44069, 20281, 20281, 1…
$ DESTINATION_PT_CODE <fct> 10051, 10051, 90079, 90079, 17229, 20141, 20141, 1…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
glimpse(pop)Rows: 738,492
Columns: 7
$ PA <chr> "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo Kio", "Ang Mo K…
$ SZ <chr> "Ang Mo Kio Town Centre", "Ang Mo Kio Town Centre", "Ang Mo Kio T…
$ AG <chr> "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to_4", "0_to…
$ Sex <chr> "Males", "Males", "Males", "Males", "Males", "Males", "Females", …
$ FA <chr> "<= 60", ">60 to 80", ">80 to 100", ">100 to 120", ">120", "Not A…
$ Pop <dbl> 0, 10, 30, 80, 20, 0, 0, 10, 40, 90, 10, 0, 0, 10, 30, 110, 30, 0…
$ Time <dbl> 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,…
glimpse(mpsz)Rows: 332
Columns: 7
$ SUBZONE_N <chr> "MARINA EAST", "INSTITUTION HILL", "ROBERTSON QUAY", "JURON…
$ SUBZONE_C <chr> "MESZ01", "RVSZ05", "SRSZ01", "WISZ01", "MUSZ02", "MPSZ05",…
$ PLN_AREA_N <chr> "MARINA EAST", "RIVER VALLEY", "SINGAPORE RIVER", "WESTERN …
$ PLN_AREA_C <chr> "ME", "RV", "SR", "WI", "MU", "MP", "WI", "WI", "SI", "SI",…
$ REGION_N <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "WEST…
$ REGION_C <chr> "CR", "CR", "CR", "WR", "CR", "CR", "WR", "WR", "CR", "CR",…
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((33222.98 29..., MULTIPOLYGON (…
1.5 Plot data
mapview creates interactive visualisations of spatial data to examine and visually investigate both aspects of spatial data, the geometries and their attributes.
plot() takes parameters for specifying points in the diagram. At its simplest, it can plot two numbers against each other. With datasets, it can generate maps and plot the specified columns/attributes, with default up to nine plots or maximum all plots.
mapview(busstops, cex = 3, alpha = 0.5, popup = NULL)# Filter hexagons that contain more than 8 bus stops
hexagon_red = filter(hexagon_grid_sf, grid_id>8)
tmap_mode("view")
redhex = tm_shape(hexagon_red)+
tm_fill(
col = "grid_id",
palette = "Reds",
style = "cont",
title = "Number of Bus Stops",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.format = list(
grid_id = list(format = "f", digits = 0)
)
)+
tm_borders(col = "grey40", lwd = 0.7)
redhexSembawang MRT (9 bus stops) 
Pasir Ris (11 bus stops) 
# Summarize the total trips by day type
total_trips <- odbus %>%
group_by(DAY_TYPE) %>%
summarise(total = sum(TOTAL_TRIPS))
# Plot using ggplot2
ggplot(total_trips, aes(x = DAY_TYPE, y = total, fill = DAY_TYPE)) +
geom_bar(stat = "identity") +
theme_minimal() +
labs(title = "Total Trips by Day Type", x = "Day Type", y = "Total Trips")
popdata2020 <- pop %>%
filter(Time == 2020) %>%
group_by(PA, SZ, AG) %>%
summarise(`POP` = sum(`Pop`)) %>%
ungroup() %>%
pivot_wider(names_from=AG, values_from=POP) %>%
mutate(YOUNG = rowSums(.[3:6]) + rowSums(.[12])) %>%
mutate(`ECONOMY ACTIVE` = rowSums(.[7:11]) + rowSums(.[13:15])) %>%
mutate(`AGED`=rowSums(.[16:21])) %>%
mutate(`TOTAL`=rowSums(.[3:21])) %>%
mutate(`DEPENDENCY` = (`YOUNG` + `AGED`)/`ECONOMY ACTIVE`) %>%
select(`PA`, `SZ`, `YOUNG`, `ECONOMY ACTIVE`, `AGED`, `TOTAL`, `DEPENDENCY`)plot(mpsz["PLN_AREA_N"])
1.6 Geovisualisation and Analysis
1.6.2 Compute the passenger trips generated by origin
Step 1:
Classify by time interval
# Function to assign peak and non-peak times
time_interval <- function(day_type, time_per_hour) {
if (day_type == "WEEKDAY") {
if (time_per_hour >= 6 & time_per_hour <= 9) {
"morning peak"
} else if (time_per_hour >= 17 & time_per_hour <= 20) {
"afternoon peak"
} else {
"non peak"
}
} else if (day_type == "WEEKENDS/HOLIDAY") {
if (time_per_hour >= 11 & time_per_hour <= 14) {
"morning peak"
} else if (time_per_hour >= 16 & time_per_hour <= 19) {
"evening peak"
} else {
"non peak"
}
} else {
"non peak"
}
}
# Assuming 'odbus' is your data frame
odbus$TIME <- mapply(time_interval, odbus$DAY_TYPE, odbus$TIME_PER_HOUR)
# Checking the first few rows of the data frame to verify the new column
head(odbus)# A tibble: 6 × 8
YEAR_MONTH DAY_TYPE TIME_PER_H…¹ PT_TYPE ORIGI…² DESTI…³ TOTAL…⁴ TIME
<chr> <chr> <dbl> <chr> <fct> <fct> <dbl> <chr>
1 2023-10 WEEKENDS/HOLIDAY 16 BUS 04168 10051 3 even…
2 2023-10 WEEKDAY 16 BUS 04168 10051 5 non …
3 2023-10 WEEKENDS/HOLIDAY 14 BUS 80119 90079 3 morn…
4 2023-10 WEEKDAY 14 BUS 80119 90079 5 non …
5 2023-10 WEEKDAY 17 BUS 44069 17229 4 afte…
6 2023-10 WEEKENDS/HOLIDAY 17 BUS 20281 20141 1 even…
# … with abbreviated variable names ¹TIME_PER_HOUR, ²ORIGIN_PT_CODE,
# ³DESTINATION_PT_CODE, ⁴TOTAL_TRIPS
Step 2:
passengertrips <- left_join(busstops, odbus,
by = c("BUS_STOP_N" = "ORIGIN_PT_CODE"))Determine weekday or weekend
passengertrips_day <- passengertrips %>%
group_by(BUS_STOP_N, BUS_ROOF_N, LOC_DESC, YEAR_MONTH, geometry) %>%
summarise(
WEEKDAY_TRIPS = sum(TOTAL_TRIPS[DAY_TYPE == "WEEKDAY"], na.rm = TRUE),
WEEKENDS_HOLIDAYS_TRIPS = sum(TOTAL_TRIPS[DAY_TYPE != "WEEKDAY"], na.rm = TRUE),
.groups = "drop"
)Timing of Day
Determine peak or non peak
passengertrips_time <- passengertrips %>%
group_by(BUS_STOP_N, BUS_ROOF_N, LOC_DESC, YEAR_MONTH, geometry) %>%
summarise(
WEEKDAY_MORNING_PEAK = sum(TOTAL_TRIPS[DAY_TYPE == "WEEKDAY" & TIME == "morning peak"], na.rm = TRUE),
WEEKDAY_AFTERNOON_PEAK = sum(TOTAL_TRIPS[DAY_TYPE == "WEEKDAY" & TIME == "afternoon peak"], na.rm = TRUE),
WEEKDAY_NON_PEAK = sum(TOTAL_TRIPS[DAY_TYPE == "WEEKDAY" & TIME == "non peak"], na.rm = TRUE),
WEEKENDS_HOLIDAYS_MORNING_PEAK = sum(TOTAL_TRIPS[DAY_TYPE != "WEEKDAY" & TIME == "morning peak"], na.rm = TRUE),
WEEKENDS_HOLIDAYS_EVENING_PEAK = sum(TOTAL_TRIPS[DAY_TYPE != "WEEKDAY" & TIME == "evening peak"], na.rm = TRUE),
WEEKENDS_HOLIDAYS_NON_PEAK = sum(TOTAL_TRIPS[DAY_TYPE != "WEEKDAY" & TIME == "non peak"], na.rm = TRUE),
.groups = "drop"
)
passengertrips_time <- passengertrips_time %>%
filter(!(WEEKDAY_MORNING_PEAK == 0
& WEEKDAY_AFTERNOON_PEAK == 0
& WEEKDAY_NON_PEAK == 0
& WEEKENDS_HOLIDAYS_MORNING_PEAK == 0
& WEEKENDS_HOLIDAYS_EVENING_PEAK == 0
& WEEKENDS_HOLIDAYS_NON_PEAK == 0))Step 3: Ensure both datasets are in the same coordinate reference system (CRS).
st_crs(passengertrips_day)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
st_crs(passengertrips_time)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
st_crs(hexagon_grid_sf)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
Step 4: Perform a spatial join to match trips to hexagons
passengergrid_day <- st_join(hexagon_grid_sf, passengertrips_day, join = st_intersects)# Remove rows with no trips (NA values)
passengergrid_day <- passengergrid_day %>%
filter(!is.na(BUS_STOP_N))
glimpse(passengergrid_day)Rows: 5,161
Columns: 8
$ grid_id <int> 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1…
$ BUS_STOP_N <chr> "25059", "25751", "26379", "25761", "25719", "…
$ BUS_ROOF_N <chr> "UNK", "B02D", "NIL", "B03", "B01C", "NIL", "N…
$ LOC_DESC <chr> "AFT TUAS STH BLVD", "BEF TUAS STH AVE 14", "Y…
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2…
$ WEEKDAY_TRIPS <dbl> 526, 468, 1055, 3420, 4584, 944, 689, 768, 644…
$ WEEKENDS_HOLIDAYS_TRIPS <dbl> 105, 96, 247, 816, 1688, 379, 248, 147, 108, 3…
$ area_hexagon_grid <POLYGON [m]> POLYGON ((3970.122 27925.48..., POLYGO…
passengergrid_time <- st_join(hexagon_grid_sf, passengertrips_time, join = st_intersects)# Remove rows with no trips (NA values)
passengergrid_time <- passengergrid_time %>%
filter(!is.na(BUS_STOP_N))
glimpse(passengergrid_time)Rows: 5,029
Columns: 12
$ grid_id <int> 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2, 2, …
$ BUS_STOP_N <chr> "25059", "25751", "26379", "25761", "25…
$ BUS_ROOF_N <chr> "UNK", "B02D", "NIL", "B03", "B01C", "N…
$ LOC_DESC <chr> "AFT TUAS STH BLVD", "BEF TUAS STH AVE …
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-…
$ WEEKDAY_MORNING_PEAK <dbl> 103, 52, 78, 185, 815, 301, 53, 60, 64,…
$ WEEKDAY_AFTERNOON_PEAK <dbl> 390, 114, 291, 1905, 2600, 299, 241, 36…
$ WEEKDAY_NON_PEAK <dbl> 33, 302, 686, 1330, 1169, 344, 395, 340…
$ WEEKENDS_HOLIDAYS_MORNING_PEAK <dbl> 0, 26, 52, 187, 367, 88, 76, 45, 21, 39…
$ WEEKENDS_HOLIDAYS_EVENING_PEAK <dbl> 56, 14, 100, 346, 533, 101, 55, 49, 53,…
$ WEEKENDS_HOLIDAYS_NON_PEAK <dbl> 49, 56, 95, 283, 788, 190, 117, 53, 34,…
$ area_hexagon_grid <POLYGON [m]> POLYGON ((3970.122 27925.48...,…
Step 5: Split passengers trip into weekday and weekend
::: panel-tabset
Day of Week
# Subset for Weekday
weekday_trips <- passengergrid_day %>%
group_by(BUS_STOP_N) %>%
summarise(
weekday_trips = sum(WEEKDAY_TRIPS, na.rm = TRUE),
)
# Subset for Weekend
weekend_trips <- passengergrid_day %>%
group_by(BUS_STOP_N) %>%
summarise(
weekend_trips = sum(WEEKENDS_HOLIDAYS_TRIPS, na.rm = TRUE),
)Time of Day
# First, ensure all necessary columns are present in the dataframe
passengergrid_clean <- passengergrid_time %>%
mutate(
WEEKDAY_MORNING_PEAK = ifelse(is.na(WEEKDAY_MORNING_PEAK), 0, WEEKDAY_MORNING_PEAK),
WEEKDAY_AFTERNOON_PEAK = ifelse(is.na(WEEKDAY_AFTERNOON_PEAK), 0, WEEKDAY_AFTERNOON_PEAK),
WEEKENDS_HOLIDAYS_MORNING_PEAK = ifelse(is.na(WEEKENDS_HOLIDAYS_MORNING_PEAK), 0, WEEKENDS_HOLIDAYS_MORNING_PEAK),
WEEKENDS_HOLIDAYS_EVENING_PEAK = ifelse(is.na(WEEKENDS_HOLIDAYS_EVENING_PEAK), 0, WEEKENDS_HOLIDAYS_EVENING_PEAK)
)
# Function to summarise and filter bus stops with no trips
summarise_and_filter <- function(data, column) {
data %>%
group_by(BUS_STOP_N) %>%
summarise(Total_Trips = sum({{ column }}, na.rm = TRUE)) %>%
filter(Total_Trips > 0) %>%
ungroup()
}
# Create subsets using the function
weekday_morning_peak <- summarise_and_filter(passengergrid_clean, WEEKDAY_MORNING_PEAK)
weekday_afternoon_peak <- summarise_and_filter(passengergrid_clean, WEEKDAY_AFTERNOON_PEAK)
weekend_morning_peak <- summarise_and_filter(passengergrid_clean, WEEKENDS_HOLIDAYS_MORNING_PEAK)
weekend_evening_peak <- summarise_and_filter(passengergrid_clean, WEEKENDS_HOLIDAYS_EVENING_PEAK)
# Check for any BUS_STOP_N that might still have issues
problematic_stops <- passengergrid_clean %>%
filter(is.na(BUS_STOP_N)) %>%
pull(BUS_STOP_N) %>%
unique()
# If problematic_stops has any values, you may need to address these specifically,
# for example by removing them from passengergrid_clean before creating the subsets
if (length(problematic_stops) > 0) {
passengergrid_clean <- passengergrid_clean %>%
filter(!(BUS_STOP_N %in% problematic_stops))
}# Convert your data to an sf object if it's not one already
weekday_sf <- st_as_sf(weekday_trips, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekday_sf$weekday_trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekday_sf) +
tm_polygons("weekday_trips",
palette = "Blues",
border.col = "grey40",
breaks = breaks,
title = "Weekday Trips") +
tm_scale_bar()+
tm_layout(main.title = "Weekday Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)# Convert your data to an sf object if it's not one already
weekday_morning_peak_sf <- st_as_sf(weekday_morning_peak, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekday_morning_peak_sf$Total_Trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekday_morning_peak_sf) +
tm_polygons("Total_Trips",
palette = "Blues",
border.col = "grey40",
breaks = breaks,
title = "weekday_morning_peak Trips") +
tm_scale_bar()+
tm_layout(main.title = "weekday_morning_peak Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)# Convert your data to an sf object if it's not one already
weekday_afternoon_peak_sf <- st_as_sf(weekday_afternoon_peak, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekday_afternoon_peak_sf$Total_Trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekday_afternoon_peak_sf) +
tm_polygons("Total_Trips",
palette = "Blues",
border.col = "grey40",
breaks = breaks,
title = "Weekday Afternoon Peak Trips") +
tm_scale_bar()+
tm_layout(main.title = "Weekday Afternoon Peak Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)# Convert your data to an sf object if it's not one already
weekend_sf <- st_as_sf(weekend_trips, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekend_sf$weekend_trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekend_sf) +
tm_polygons("weekend_trips",
palette = "Purples",
border.col = "grey40",
breaks = breaks,
title = "Weekend Trips") +
tm_scale_bar()+
tm_layout(main.title = "Weekend Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)Weekend morning peaks
# Convert your data to an sf object if it's not one already
weekend_morning_peak_sf <- st_as_sf(weekend_morning_peak, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekend_morning_peak_sf$Total_Trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekend_morning_peak_sf) +
tm_polygons("Total_Trips",
palette = "Purples",
border.col = "grey40",
breaks = breaks,
title = "Weekend Morning Peak Trips") +
tm_scale_bar()+
tm_layout(main.title = "Weekend Morning Peak Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)# Convert your data to an sf object if it's not one already
weekend_evening_peak_sf <- st_as_sf(weekend_evening_peak, coords = c("longitude", "latitude"), crs = 4326)
# Calculate breaks at intervals of 25,000
max_trips <- max(weekend_evening_peak_sf$Total_Trips, na.rm = TRUE)
breaks <- c(1, 2500, 5000, 7500, 10000, 25000, 50000, Inf)
# Print the map
tmap_mode("view")
tm_shape(weekend_evening_peak_sf) +
tm_polygons("Total_Trips",
palette = "Purples",
border.col = "grey40",
breaks = breaks,
title = "Weekend Evening Peak Trips") +
tm_scale_bar()+
tm_layout(main.title = "Weekend Evening Peak Trips by Bus Stop",
main.title.position = "center",
frame = FALSE)1.6 Local Indicators of Spatial Association (LISA) Analysis
1.6.1 Compute LISA of the passengers trips generate by origin at hexagon level.
# Remove rows with NA in BUS_STOP_N and calculate TOTAL_TRIPS
passengergrid_total <- passengergrid_day %>%
mutate(TOTAL_TRIPS = WEEKDAY_TRIPS + WEEKENDS_HOLIDAYS_TRIPS)wm_q <- poly2nb(passengergrid_total,
queen=TRUE)
summary(wm_q)Neighbour list object:
Number of regions: 5161
Number of nonzero links: 112788
Percentage nonzero weights: 0.4234432
Average number of links: 21.8539
7 regions with no links:
1767 2407 2884 3350 4627 5154 5222
Link number distribution:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
7 12 17 27 36 89 67 103 85 71 125 135 105 140 150 139 189 191 222 153
20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
186 161 244 261 220 212 200 205 196 173 117 128 135 117 114 69 62 49 34 50
40 41 42 43 44 45 46 49
46 52 11 25 13 6 4 8
12 least connected regions:
34 251 313 1543 3507 3507.1 5129 5159 5278 5278.1 5558 5558.1 with 1 link
8 most connected regions:
3292 3292.1 3292.2 3292.3 3292.4 3292.5 3292.6 3292.7 with 49 links
#rswm_q <- nb2listw(wm_q,
# style="W")
#rswm_q# Remove NA values from WEEKDAY_TRIPS
passengergrid_total <- passengergrid_total[!is.na(passengergrid_total$WEEKDAY_TRIPS),]
# Compute local Moran's I
#localIMI_weekday <- localmoran(passengergrid_total$WEEKDAY_TRIPS, rswm_q)
# View the results
#head(localIMI_weekday)1.6.2 Display the LISA maps of the passengers trips generate by origin at hexagon level.
1.7 Emerging Hot Spot Analysis (EHSA)
longitude <- map_dbl(passengertrips_time$geometry, ~st_centroid(.x)[[1]])latitude <- map_dbl(passengertrips_time$geometry, ~st_centroid(.x)[[2]])coords <- cbind(longitude, latitude)#coords <- coordinates(hunan)
k1 <- knn2nb(knearneigh(coords))
k1dists <- unlist(nbdists(k1, coords, longlat = TRUE))
summary(k1dists) Min. 1st Qu. Median Mean 3rd Qu. Max.
0 3210 4940 6217 8403 19777
wm_d62 <- dnearneigh(coords, 0, 62, longlat = TRUE)# wm62_lw <- nb2listw(wm_d62, style = 'W')
# summary(wm62_lw)knn <- knn2nb(knearneigh(coords, k=8))knn_lw <- nb2listw(knn, style = 'B')